A Robust Clustering Approach Based on KNN and Modified C-Means Algorithm
Submitted: Aug 20, 2013; Accepted: Sep 28, 2013; Published: Oct 6, 2013

Abstract: Cluster analysis partitions a data set into groups of similar individuals. It is an approach to unsupervised learning and one of the major techniques in pattern recognition. The fuzzy c-means (FCM) algorithm needs the number of classes and an initial center value for each cluster. These values are chosen randomly, so the objective function may converge to one of several local centers, and many iterations are needed before FCM reaches the global center for each cluster. In this paper, we propose a robust hybrid algorithm that is genuinely unsupervised: it needs neither initial center values nor the number of clusters. The first layer of the algorithm finds the initial cluster centers using K-nearest neighbor (K-NN) rules in an unsupervised manner. In the second layer, we apply FCM only once to obtain the optimal clustering. This is done by means of a fuzzy clustering validation criterion, unlike standard FCM, which needs an iterative process. We applied the new algorithm to several standard data sets (IRIS). The results show that it is more accurate than FCM both in estimating the optimal number of clusters and in correctly assigning data to their true clusters.
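For readers unfamiliar with FCM, the following is a minimal sketch of the textbook fuzzy c-means updates, started from caller-supplied centers. It is not the authors' hybrid algorithm: the abstract does not specify the K-NN initialization or the validation criterion, so `init_centers` simply stands in for whatever the first layer would produce. The function names, the fuzzifier `m = 2`, and the stopping tolerance are assumptions made for illustration.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Textbook FCM membership update: u[i][j] for point i and center j."""
    power = 2.0 / (m - 1.0)
    u = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # Point coincides with one or more centers: share membership among them.
            zeros = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(zeros)
            u.append([z / total for z in zeros])
            continue
        u.append([1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists])
    return u

def fcm_centers(points, u, m=2.0):
    """Textbook FCM center update: mean of the points weighted by u**m."""
    dim = len(points[0])
    centers = []
    for j in range(len(u[0])):
        weights = [row[j] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ))
    return centers

def fcm(points, init_centers, m=2.0, tol=1e-6, max_iter=100):
    """Alternate the two updates until the centers stop moving."""
    centers = [tuple(c) for c in init_centers]
    for _ in range(max_iter):
        u = fcm_memberships(points, centers, m)
        new_centers = fcm_centers(points, u, m)
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, fcm_memberships(points, centers, m)
```

With randomly chosen starting centers this loop can take many iterations or settle in a poor local solution, which is the weakness the abstract describes. Starting from centers that are already close to the true ones typically lets it stop after only a few updates.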
Similar resources
Bilateral Weighted Fuzzy C-Means Clustering
Nowadays, the Fuzzy C-Means method has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called Bilateral Weighted Fuzzy C-Means (BWFCM). We used a new objective function that uses some k...
A Hybrid Data Clustering Algorithm Using Modified Krill Herd Algorithm and K-MEANS
Data clustering is the process of partitioning a set of data objects into meaningful clusters or groups. Because clustering algorithms are used so widely in many fields, a lot of research is still going on to find the best and most efficient clustering algorithm. K-means is simple and easy to implement, but it is sensitive to the initialization of the cluster centers and hence can become trapped in a local optimum. In this paper...
A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach
In recent years, the advancement of information gathering technologies such as GPS and GSM networks has led to huge, complex datasets such as time series and trajectories. As a result, it is essential to use appropriate methods to analyze the large raw datasets produced. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...
ADAPTIVE NEURO FUZZY INFERENCE SYSTEM BASED ON FUZZY C–MEANS CLUSTERING ALGORITHM, A TECHNIQUE FOR ESTIMATION OF TBM PENETRATION RATE
The tunnel boring machine (TBM) penetration rate estimation is one of the crucial and complex tasks encountered frequently to excavate the mechanical tunnels. Estimating the machine penetration rate may reduce the risks related to high capital costs typical for excavation operation. Thus establishing a relationship between rock properties and TBM pe...
Using fuzzy c-means clustering algorithm for common lecturer timetabling among departments
University course timetabling problem is one of the hard problems and it must be done for each term frequently which is an exhausting and time consuming task. The main technique in the presented approach is focused on developing and making the process of timetabling common lecturers among different departments of a university scalable. The aim of this paper is to improve the satisfaction of com...
An Effective Approach for Robust Metric Learning in the Presence of Label Noise
Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on the similarity to the query. As generic measures such as ...